refactor: use OpenAiApi directly to return OpenAI Chat Completions format #6341
eye-gu wants to merge 7 commits into
Pull request overview
Refactors the AI proxy plugin to call OpenAI-compatible upstream APIs directly through OpenAiApi, returning OpenAI Chat Completions-style responses instead of Spring AI ChatResponse objects.
Changes:
- Adds request adaptation, direct OpenAI API execution, SSE chunk output, and upstream error logging.
- Replaces ChatClient-based caching/fallback paths with OpenAiApi-based logic.
- Updates tests for the new direct-call and protocol-adapter behavior.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| AiProxyPluginConfiguration.java | Wires the refactored plugin, handler, executor, and API key subscriber. |
| CommonAiProxyApiKeyDataSubscriberTest.java | Updates subscriber tests for constructor changes. |
| UpstreamErrorLoggerTest.java | Adds tests for upstream error logging. |
| AiProxyExecutorServiceTest.java | Updates executor tests for direct OpenAiApi calls. |
| AiProxyPluginHandlerTest.java | Adds handler cache/removal tests. |
| AiProxyPluginTest.java | Updates plugin tests for direct OpenAiApi execution. |
| CommonAiProxyApiKeyDataSubscriber.java | Clears OpenAiApi cache on API key refresh. |
| UpstreamErrorLogger.java | Adds shared upstream WebClient error logging. |
| AiProxyExecutorService.java | Implements direct streaming/non-streaming OpenAiApi execution with retry/fallback. |
| AiProxyPluginHandler.java | Switches invalidation to OpenAiApiCache and removes selector API keys on delete. |
| OpenAiApiCache.java | Adds OpenAiApi instance cache and invalidation helpers. |
| AiProxyPlugin.java | Builds OpenAiApi clients, adapts requests, and writes OpenAI-format responses. |
| OpenAiProtocolAdapterTest.java | Adds tests for stream resolution and request-field fallback behavior. |
| SimpleModelFallbackStrategy.java | Removes ChatClient fallback strategy. |
| FallbackStrategy.java | Removes fallback strategy interface. |
| OpenAiProtocolAdapter.java | Adds raw OpenAI request parsing and config fallback merging. |
| AiCommonConfig.java | Makes default temperature unset instead of 0.8. |
Comments suppressed due to low confidence (1)
shenyu-plugin/shenyu-plugin-ai/shenyu-plugin-ai-proxy/src/main/java/org/apache/shenyu/plugin/ai/proxy/enhanced/cache/OpenAiApiCache.java:102
- This comment says the cache evicts the oldest entries, but the implementation iterates a ConcurrentHashMap, which has no insertion/access ordering and evicts arbitrary entries. Update the comment or use an ordered cache if oldest-entry eviction is required.
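
If oldest-entry eviction is actually wanted, one possible shape is an access-ordered `LinkedHashMap`. The sketch below is only an illustration under assumptions: `LruClientCache` and its methods are hypothetical names, not the API of `OpenAiApiCache`, and it trades `ConcurrentHashMap` concurrency for simple `synchronized` methods.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical bounded cache that evicts the least recently used entry (not the PR's OpenAiApiCache). */
final class LruClientCache<K, V> {

    private final Map<K, V> entries;

    LruClientCache(final int maxSize) {
        // accessOrder = true: iteration order runs from least to most recently used.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(final Map.Entry<K, V> eldest) {
                // Called after each insertion; returning true drops the eldest entry.
                return size() > maxSize;
            }
        };
    }

    /** Returns the cached value, creating it with the factory on a miss. */
    synchronized V computeIfAbsent(final K key, final Function<? super K, ? extends V> factory) {
        final V existing = entries.get(key); // get() counts as an access
        if (existing != null) {
            return existing;
        }
        final V created = factory.apply(key);
        entries.put(key, created); // may evict the least recently used entry
        return created;
    }

    synchronized boolean containsKey(final K key) {
        return entries.containsKey(key); // does not change access order
    }

    synchronized int size() {
        return entries.size();
    }
}
```

With a capacity of 2, inserting `a` and `b`, reading `a` again, and then inserting `c` evicts `b`, because `b` is the least recently used key at that point.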
    }
    final ObjectNode mutableRoot = (ObjectNode) root;

    if (root.hasNonNull("max_completion_tokens") && !root.hasNonNull("max_tokens")) {
Don't use magic values here; please add constants for the OpenAI field names.
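
A rough sketch of what that could look like, assuming a hypothetical `OpenAiFieldNames` holder (no such class is shown in this PR). To stay dependency-free it works on a plain `Map` rather than Jackson's `ObjectNode`. Whether the original key is kept or removed after copying is not visible in the snippet above, so this sketch simply keeps it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for OpenAI request field names; the class name is an assumption. */
final class OpenAiFieldNames {

    static final String MAX_COMPLETION_TOKENS = "max_completion_tokens";

    static final String MAX_TOKENS = "max_tokens";

    private OpenAiFieldNames() {
    }

    /**
     * Copies max_completion_tokens into max_tokens when only the former is set,
     * mirroring the condition in the reviewed line. The input map is not modified.
     */
    static Map<String, Object> normalizeTokenLimit(final Map<String, Object> request) {
        final Map<String, Object> copy = new LinkedHashMap<>(request);
        if (copy.get(MAX_COMPLETION_TOKENS) != null && copy.get(MAX_TOKENS) == null) {
            copy.put(MAX_TOKENS, copy.get(MAX_COMPLETION_TOKENS));
        }
        return copy;
    }
}
```

The same constants could then replace the string literals in the `hasNonNull(...)` checks.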
    final ChatCompletionRequest result = JsonUtils.jsonToObject(mutableRoot.toString(), ChatCompletionRequest.class);
    if (Objects.isNull(result)) {
        throw new IllegalArgumentException("Failed to parse request body into ChatCompletionRequest");
Please throw new ShenyuException(value) here instead of IllegalArgumentException.
            final String requestBody, final boolean stream) {
        return mainApi.chatCompletionStream(request)
                .doOnError(e -> UpstreamErrorLogger.logUpstreamError(LOG, e, "direct stream"))
                .retryWhen(Retry.max(1)
@moremind This is the original behavior — stream requests are limited to a single retry. I guess the rationale is that streaming responses are typically long-running; allowing multiple retries would accumulate excessive latency, making the overall response time unacceptable. The isRetryable filter added in this PR is an improvement over the original logic, but the max(1) cap remains intentionally unchanged.
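
To make the intended semantics concrete, here is a plain-Java analogue of "retry at most once, and only for retryable errors". It is a sketch rather than the Reactor code in the PR: `SingleRetry` and `callWithOneRetry` are hypothetical names, and the predicate stands in for the `isRetryable` filter mentioned above.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/** Hypothetical helper illustrating a single filtered retry; not part of the PR. */
final class SingleRetry {

    private SingleRetry() {
    }

    /**
     * Runs the call and retries it exactly once if the first failure is retryable.
     * Non-retryable failures and a failed second attempt propagate to the caller.
     */
    static <T> T callWithOneRetry(final Callable<T> call,
                                  final Predicate<Throwable> isRetryable) throws Exception {
        try {
            return call.call();
        } catch (Exception first) {
            if (!isRetryable.test(first)) {
                throw first; // e.g. a client-side error that a retry cannot fix
            }
            return call.call(); // the one allowed retry
        }
    }
}
```

The cap keeps worst-case latency at roughly two upstream attempts, which matches the rationale given for long-running streams.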
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 12 comments.
Comments suppressed due to low confidence (1)
shenyu-plugin/shenyu-plugin-ai/shenyu-plugin-ai-proxy/src/main/java/org/apache/shenyu/plugin/ai/proxy/enhanced/cache/OpenAiApiCache.java:96
- This new cache has eviction, prefix invalidation, and singleton clearing behavior, but there are no unit tests for it while the neighboring AiProxyApiKeyCache has dedicated cache tests. Add coverage for computeIfAbsent reuse, remove(selectorId), clearAll, and eviction boundaries to catch stale-client regressions.
close #6340
Refactors the `ai-proxy` plugin to call upstream providers via `OpenAiApi` directly instead of Spring AI's `ChatClient`, ensuring responses conform to the standard OpenAI Chat Completions API format.

Changes
- `OpenAiProtocolAdapter`: parses raw request JSON into `ChatCompletionRequest`, preserving all fields (including `reasoning_content`) that Spring AI's `createRequest()` would lose. Also resolves the `stream` flag from the client request with admin config as fallback, and converts `max_completion_tokens` → `max_tokens` for compatibility.
- `UpstreamErrorLogger`: shared utility to extract `WebClientResponseException` details for upstream error logging.
- `ChatClientCache`: no longer needed since `OpenAiApi` instances are lightweight and stateless.
- `FallbackStrategy` / `SimpleModelFallbackStrategy`: fallback is now handled inline with `OpenAiApi` directly.
- `AiModelFactoryRegistry` dependency from the plugin: `OpenAiApi` is constructed directly from `AiCommonConfig` (baseUrl + apiKey).
- `ChatCompletionChunk` SSE events + `data: [DONE]` terminator.
- `ChatCompletion` JSON directly.
- `Retry.backoff(3, 1s)` instead of `Retry.max(1)`.

Testing
- `OpenAiProtocolAdapterTest`: covers stream resolution, request parsing, field preservation, and fallback config merging.
- `AiProxyPluginTest`, `AiProxyExecutorServiceTest`, `CommonAiProxyApiKeyDataSubscriberTest` to match the new API.

Make sure that:
- `./mvnw clean install -Dmaven.javadoc.skip=true`.
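
For reference, the streaming wire format described under Changes (one SSE event per chunk, then a `data: [DONE]` terminator) can be sketched as follows. This is a minimal illustration, not the plugin's writer: `SseFraming` and its methods are hypothetical names, and each chunk is assumed to be already serialized to JSON.

```java
import java.util.List;

/** Hypothetical helper showing OpenAI-style SSE framing; not the PR's implementation. */
final class SseFraming {

    /** Terminator event that OpenAI-compatible streams send after the last chunk. */
    static final String DONE_EVENT = "data: [DONE]\n\n";

    private SseFraming() {
    }

    /** Wraps one serialized ChatCompletionChunk as a single SSE event. */
    static String toSseEvent(final String chunkJson) {
        return "data: " + chunkJson + "\n\n";
    }

    /** Frames all chunks in order and appends the terminator event. */
    static String frameStream(final List<String> chunkJsons) {
        final StringBuilder body = new StringBuilder();
        for (String chunkJson : chunkJsons) {
            body.append(toSseEvent(chunkJson));
        }
        return body.append(DONE_EVENT).toString();
    }
}
```

Each event ends with a blank line, so clients that parse OpenAI streams line by line can detect the end of the response from the `[DONE]` payload.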